Build Streaming data pipelines
Based on Cloud Pub Sub and Example of End-to-end Data Pipeline.
Set streaming option to true:
options.view_as(StandardOptions).streaming = True
Reading from stream
instead of beam.io.readFromTxt()
, we must use beam.io.ReadFromPubSub()
Then remove the pipelines:
- cleaned_data
- delivered_orders
- other_orders This because streams are unbounded data and can't be grouped
Adding wait_until_finish()
adding after ret=p.run()
:
ret.wait_until_finish()
Note that the code after this line won't run until the finish of the pub/sub. If we want to launch other commands we must creat a new thread and launch it before
wait_until_finish()